    Online learning of personalised human activity recognition models from user-provided annotations

    PhD Thesis

    In Human Activity Recognition (HAR), supervised and semi-supervised training are important tools for devising parametric activity models. For the best modelling performance, large amounts of annotated personalised sample data are typically required. Annotation often represents the bottleneck in the overall modelling process, as it usually involves retrospective analysis of experimental ground truth, such as video footage. These approaches typically neglect that prospective users of HAR systems are themselves key sources of ground truth for their own activities. This research therefore involves the users of HAR monitors in the annotation process. The process relies solely on users' short-term memory and engages them to parsimoniously provide annotations for their own activities as they unfold. The effect of user input is optimised by using Online Active Learning (OAL) to identify the most critical annotations, which are expected to yield the largest HAR model performance gains. Personalised HAR models are trained from user-provided annotations as part of the evaluation, focusing mainly on objective model accuracy. The OAL approach is contrasted with Random Selection (RS), a naive method which makes uninformed annotation requests. A range of simulation-based annotation scenarios demonstrates that OAL brings benefits in terms of HAR model performance over RS. Additionally, a mobile application is implemented and deployed in a naturalistic context to collect annotations from a panel of human participants. The deployment proves that the method can truly run in online mode, and it also shows that considerable HAR model performance gains can be registered even under realistic conditions. The findings from this research point to the conclusion that online learning from user-provided annotations is a valid solution to the problem of constructing personalised HAR models.
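
    As a concrete illustration of the query rule at the heart of this approach, the sketch below implements a minimal uncertainty-driven online active learner in Python. The OnlineActiveLearner class, the margin threshold, and the ask_user callback are hypothetical names and settings chosen for this example; they are not the thesis's actual implementation.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier  # needs scikit-learn >= 1.1

class OnlineActiveLearner:
    """Minimal sketch: query the user only when the model is uncertain."""

    def __init__(self, classes, margin_threshold=0.5):
        self.model = SGDClassifier(loss="log_loss")  # supports partial_fit
        self.classes = classes
        self.margin_threshold = margin_threshold  # assumed value, for illustration
        self.seeded = False

    def _margin(self, x):
        # Difference between the two highest class probabilities:
        # a small margin means the model is unsure about this window.
        proba = np.sort(self.model.predict_proba(x.reshape(1, -1))[0])
        return proba[-1] - proba[-2] if len(proba) > 1 else proba[-1]

    def process(self, x, ask_user):
        """Handle one feature window; ask_user(x) returns a label or None."""
        if not self.seeded:
            label = ask_user(x)  # bootstrap: the first window is always queried
            if label is not None:
                self.model.partial_fit(x.reshape(1, -1), [label],
                                       classes=self.classes)
                self.seeded = True
            return None
        if self._margin(x) < self.margin_threshold:
            label = ask_user(x)  # uncertain: request an annotation from the user
            if label is not None:
                self.model.partial_fit(x.reshape(1, -1), [label])
                return label
        return self.model.predict(x.reshape(1, -1))[0]  # confident: just predict
```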

    Tracking Dengue Epidemics using Twitter Content Classification and Topic Modelling

    Detecting and preventing outbreaks of mosquito-borne diseases such as Dengue and Zika in Brazil and other tropical regions has long been a priority for governments in affected areas. Streaming social media content, such as Twitter, is increasingly being used for health vigilance applications such as flu detection. However, previous work has not addressed the complexity of drastic seasonal changes in Twitter content across multiple epidemic outbreaks. To address this gap, this paper contrasts two complementary approaches to detecting Twitter content that is relevant for Dengue outbreak detection, namely supervised classification and unsupervised clustering using topic modelling. Each approach has benefits and shortcomings. Our classifier achieves a prediction accuracy of about 80% based on a small training set of about 1,000 instances, but the need for manual annotation makes it hard to track seasonal changes in the nature of the epidemics, such as the emergence of new types of virus in certain geographical locations. In contrast, LDA-based topic modelling scales well, generating cohesive and well-separated clusters from larger samples. While clusters can easily be re-generated following changes in the epidemics, this approach makes it hard to clearly segregate relevant tweets into well-defined clusters.

    Comment: Procs. SoWeMine, co-located with ICWE 2016, Lugano, Switzerland.
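
    For illustration, the following is a minimal sketch of the unsupervised side of this comparison: LDA-based topic modelling over a handful of hypothetical tweets using scikit-learn. The sample tweets, topic count, and vectoriser settings are assumptions made for the example, not the paper's configuration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical tweets standing in for a streaming sample.
tweets = [
    "feeling feverish, dengue symptoms again this summer",
    "city fumigation campaign against mosquitos starts today",
    "new zika cases reported near the river district",
    "headache and joint pain, going to the clinic",
]

# Build a bag-of-words document-term matrix.
vectorizer = CountVectorizer(max_features=5000, stop_words="english")
doc_term = vectorizer.fit_transform(tweets)

# Fit an LDA model: each tweet gets a distribution over latent topics,
# and each topic is a distribution over words.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(doc_term)

# Print the top words per topic as a rough cluster summary.
vocab = vectorizer.get_feature_names_out()
for k, word_weights in enumerate(lda.components_):
    top = word_weights.argsort()[::-1][:5]
    print(f"topic {k}:", ", ".join(vocab[i] for i in top))
```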

    Predicting the execution time of workflow activities based on their input features

    The ability to accurately estimate the execution time of computationally expensive e-science algorithms enables better scheduling of workflows that incorporate those algorithms as their building blocks, and may give users an insight into the expected cost of workflow execution on cloud resources. When a large history of past runs can be observed, crude estimates such as the average execution time can easily be provided. We make the hypothesis that, for some algorithms, better estimates can be obtained by using the histories to learn regression models that predict execution time based on selected features of their inputs. We refer to this property as input predictability of algorithms. We are motivated by e-science workflows that involve repetitive training of multiple learning models. Thus, we verify our hypothesis on the specific case of the C4.5 decision tree builder, a well-known learning method whose training execution time is indeed sensitive to the specific input dataset, but in non-obvious ways. We use the case study to demonstrate a method for assessing input predictability. While this yields promising results, we also find that its more general applicability involves a trade-off between the black-box nature of the algorithms under analysis and the need for expert insight into relevant features of their inputs.
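
    A minimal sketch of how such input predictability might be assessed, assuming a synthetic run history and a generic random-forest regressor; the paper's own feature set and model may differ:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Hypothetical history of past runs: each row holds simple features of the
# dataset fed to the algorithm (number of instances, number of attributes,
# number of distinct class labels) and the observed runtime. The feature
# choice is an illustrative assumption.
rng = np.random.default_rng(0)
n_runs = 200
features = np.column_stack([
    rng.integers(1_000, 100_000, n_runs),   # instances
    rng.integers(5, 200, n_runs),           # attributes
    rng.integers(2, 20, n_runs),            # class labels
])
# Synthetic runtimes, loosely n * d * log(n)-shaped, plus noise.
runtime = (features[:, 0] * features[:, 1] * np.log(features[:, 0])
           * 1e-6 + rng.normal(0, 5, n_runs))

model = RandomForestRegressor(n_estimators=100, random_state=0)
# Cross-validated R^2 as a rough score of "input predictability":
# high scores suggest runtime is learnable from these input features.
scores = cross_val_score(model, features, runtime, cv=5, scoring="r2")
print("mean R^2:", scores.mean().round(3))
```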

    Predicting the Execution Time of Workflow Blocks Based on Their Input Features

    The ability to accurately estimate the execution time of computationally expensive e-science algorithms enables better scheduling of workflows that incorporate those algorithms as their building blocks, and may give users an insight into the expected cost of workflow execution on cloud resources. When a large history of past runs can be observed, crude estimates such as the average execution time can easily be provided. We make the hypothesis that, for some algorithms, better estimates can be obtained by using the histories to learn regression models that predict execution time based on selected features of their inputs. We refer to this property as input predictability of algorithms. We are motivated by e-science workflows that involve repetitive training of multiple learning models. Thus, we verify our hypothesis on the specific case of the C4.5 decision tree builder, a well-known learning method whose training execution time is indeed sensitive to the specific input dataset, but in non-obvious ways. We use the case study to demonstrate a method for assessing input predictability. While this yields promising results, we also find that its more general applicability involves a trade-off between the black-box nature of the algorithms under analysis and the need for expert insight into relevant features of their inputs.

    Bootstrapping personalised Human Activity Recognition models using Online Active Learning

    In Human Activity Recognition (HAR), supervised and semi-supervised training are important tools for devising parametric activity models. For the best modelling performance, typically large amounts of annotated sample data are required. Annotating often represents the bottleneck in the overall modelling process, as it usually involves retrospective analysis of experimental ground truth, like video footage. These approaches typically neglect that prospective users of HAR systems are themselves key sources of ground truth for their own activities. We therefore propose an Online Active Learning framework to collect user-provided annotations and to bootstrap personalized human activity models. We evaluate our framework on existing benchmark datasets and demonstrate how it outperforms standard, more naive annotation methods. Furthermore, we conduct a user study where participants provide annotations using a mobile app that implements our framework. We show that Online Active Learning is a viable method to bootstrap personalized models, especially in live situations without expert supervision.
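
    As a toy illustration of the kind of benchmark comparison described here, the sketch below pits an uncertainty-driven query policy against uninformed random selection over the same simulated labelled stream and annotation budget. The synthetic data, budget, margin threshold, and query rate are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier  # needs scikit-learn >= 1.1

def run_stream(query_policy, X, y, budget, seed=0):
    """Prequential evaluation: test on each window, then maybe query it."""
    rng = np.random.default_rng(seed)
    model = SGDClassifier(loss="log_loss", random_state=seed)
    classes, spent, correct, seen = np.unique(y), 0, 0, 0
    fitted = False
    for x, label in zip(X, y):
        x = x.reshape(1, -1)
        if fitted:
            correct += int(model.predict(x)[0] == label)
            seen += 1
        if spent < budget and query_policy(model, x, fitted, rng):
            model.partial_fit(x, [label], classes=None if fitted else classes)
            fitted, spent = True, spent + 1
    return correct / max(seen, 1)

def oal_policy(model, x, fitted, rng):
    if not fitted:
        return True                        # always query the first window
    p = np.sort(model.predict_proba(x)[0])
    return (p[-1] - p[-2]) < 0.3           # query only uncertain windows

def random_policy(model, x, fitted, rng):
    return rng.random() < 0.1              # uninformed 10% query rate

X, y = make_classification(n_samples=3000, n_features=20, n_classes=3,
                           n_informative=10, random_state=0)
print("OAL   :", round(run_stream(oal_policy, X, y, budget=150), 3))
print("random:", round(run_stream(random_policy, X, y, budget=150), 3))
```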

    On strategies for budget-based online annotation in human activity recognition

    Bootstrapping activity recognition systems in ubiquitous and mobile computing scenarios often comes with the challenge of obtaining reliable ground truth annotations. A promising approach to overcome these difficulties involves obtaining online activity annotations directly from users. However, such direct engagement has its limitations, as users typically show only limited tolerance for unwanted interruptions such as prompts for annotations. In this paper we explore the effectiveness of approaches to online, user-based annotation of activity data. Our central assumption is the existence of a fixed, limited budget of annotations a user is willing to provide. We evaluate different strategies on how to spend such a budget most effectively. Using the Opportunity benchmark we simulate online annotation scenarios for a variety of budget configurations, and we show that effective online annotation can still be achieved using reduced annotation effort.
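
    As a minimal sketch of what budget-spending strategies can look like, the snippet below contrasts two trivial schedules over a simulated stream: spending the whole annotation budget up front versus spreading it uniformly. The stream length, budget, and policies are illustrative assumptions, not the strategies evaluated in the paper.

```python
# Two simple budget-spending schedules for online annotation requests.

def front_loaded(t, stream_len, budget, spent):
    # Annotate the first `budget` windows, then never ask again.
    return spent < budget and t < budget

def uniform_spread(t, stream_len, budget, spent):
    # Ask roughly every stream_len // budget windows across the stream.
    interval = stream_len // budget
    return spent < budget and t % interval == 0

stream_len, budget = 1000, 50   # arbitrary assumed values
for name, policy in [("front-loaded", front_loaded),
                     ("uniform", uniform_spread)]:
    spent, queried_at = 0, []
    for t in range(stream_len):
        if policy(t, stream_len, budget, spent):
            spent += 1
            queried_at.append(t)
    print(name, "queries span:", queried_at[0], "to", queried_at[-1])
```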
